Github - https://github.com/hariomvyas/UMBC/blob/main/Data601/Project3/Vyas_Hariom_Project3.ipynb
Data Source - https://datadashboard.fda.gov/ora/cd/inspections.htm
Data Name – U.S. Food & Drug Administration - Inspections Details
Data was provided by Professor. It is free to download and legal to access. Documented EDA of Data is not available.
Data Size – Rows (222697), Columns (8), Size (9.09 MB)
Data Source - https://datadashboard.fda.gov/ora/cd/inspections.htm
Data Name – U.S. Food & Drug Administration - Inspections Citations Details
Data was provided by Professor. It is free to download and legal to access. Documented EDA of Data is not available.
Data Size – Rows (257624), Columns (14), Size (17.0 MB)
1. Loading dataset, Combining two datasets, and Cleaning.
2. Finding relationship between features of dataset
3. Presenting Hypothesis and Visulizations based on the it.
## Importing Required Libraries
import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.figure_factory as ff
%matplotlib inline
## Creating DataFrames by reading MS Excel Files
df1 = pd.read_excel("39c00379-17c1-4120-96b7-fa372e26f8f0.xlsx")
df2 = pd.read_excel("6481b945-7549-44e5-a65b-8b8eb8b92c87.xlsx")
## File1 Information
print(df1.shape, "\n")
print(df1.columns, "\n")
df1.info()
(222697, 8)
Index(['Inspection ID', 'FEI Number', 'Legal Name', 'Inspection End Date',
'Program Area', 'Act/CFR Number', 'Short Description',
'Long Description'],
dtype='object')
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 222697 entries, 0 to 222696
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Inspection ID 222697 non-null int64
1 FEI Number 222697 non-null int64
2 Legal Name 222697 non-null object
3 Inspection End Date 222697 non-null datetime64[ns]
4 Program Area 222697 non-null object
5 Act/CFR Number 222516 non-null object
6 Short Description 222697 non-null object
7 Long Description 222694 non-null object
dtypes: datetime64[ns](1), int64(2), object(5)
memory usage: 13.6+ MB
## File1 DataFrame
df1.head(15)
| Inspection ID | FEI Number | Legal Name | Inspection End Date | Program Area | Act/CFR Number | Short Description | Long Description | |
|---|---|---|---|---|---|---|---|---|
| 0 | 1167299 | 3017202975 | PAVmed Inc. | 2022-04-15 | Devices | 21 CFR 820.100(b) | Documentation | Corrective and preventive action activities an... |
| 1 | 1167299 | 3017202975 | PAVmed Inc. | 2022-04-15 | Devices | 21 CFR 820.198(a) | Lack of or inadequate complaint procedures | Procedures for receiving, reviewing, and evalu... |
| 2 | 1167002 | 3006463947 | Ximedica LLC | 2022-04-13 | Devices | 21 CFR 820.30(g) | Design validation - production units | The design was not validated under defined ope... |
| 3 | 1167016 | 3009492128 | Exogenesis Corporation | 2022-04-12 | Devices | 21 CFR 820.70(c) | Environmental control Lack of or inadequate p... | Procedures to control environmental conditions... |
| 4 | 1167016 | 3009492128 | Exogenesis Corporation | 2022-04-12 | Devices | 21 CFR 820.250(b) | Sampling plans | Sampling plans are not based on valid statisti... |
| 5 | 1167392 | 3004217244 | Trio Community Meals | 2022-04-11 | Foods | 21 CFR 123.6(b) | No HACCP plan | You do not have a written HACCP plan that outl... |
| 6 | 1167392 | 3004217244 | Trio Community Meals | 2022-04-11 | Foods | 21 CFR 123.9(b)(1) | Record retention | Your monitoring records are not maintained at ... |
| 7 | 1167392 | 3004217244 | Trio Community Meals | 2022-04-11 | Foods | 21 CFR 123.11(c) | Sanitation Records | You are not maintaining sanitation control rec... |
| 8 | 1166559 | 3006468808 | Metro-Pack, Inc. | 2022-04-07 | Foods | 21 CFR 111.15(d)(2) | Pest control measures | You did not take effective measures to exclude... |
| 9 | 1166559 | 3006468808 | Metro-Pack, Inc. | 2022-04-07 | Foods | 21 CFR 111.475(b)(1) | Written procedures - holding; distributing | You did not make and keep written procedures f... |
| 10 | 1166739 | 3007335286 | Kiryas Yoel World Famous Kosher Bakery, Inc. | 2022-04-07 | Foods | 21 CFR 117.20(b) | Plant Construction and Design | Your plant was not designed to facilitate main... |
| 11 | 1166739 | 3007335286 | Kiryas Yoel World Famous Kosher Bakery, Inc. | 2022-04-07 | Foods | 21 CFR 117.35(d) | Sanitation of food contact surfaces - frequency | You did not clean and sanitize your utensils o... |
| 12 | 1166739 | 3007335286 | Kiryas Yoel World Famous Kosher Bakery, Inc. | 2022-04-07 | Foods | 21 CFR 117.40 | Equipment and Utensils - Design and Maintenance | Your equipment and utensils were not designed ... |
| 13 | 1166435 | 3010610364 | Flushing Meat Provision Corp. | 2022-04-06 | Foods | 21 CFR 117.35(c) | Pest Control | You did not exclude pests from your food plant... |
| 14 | 1166444 | 3009519707 | Revolutionary Science | 2022-04-06 | Devices | 21 CFR 803.17(a)(1) | Lack of System for Event Evaluations | The written MDR Procedure does not include an ... |
## File2 Information
print(df2.shape, "\n")
print(df2.columns, "\n")
df2.info()
(257624, 14)
Index(['FEI Number', 'Legal Name', 'City', 'State', 'Zip', 'Country/Area',
'Fiscal Year', 'Inspection ID', 'Posted Citations',
'Inspection End Date', 'Classification', 'Project Area', 'Product Type',
'FMD-145 Date'],
dtype='object')
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 257624 entries, 0 to 257623
Data columns (total 14 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 FEI Number 257624 non-null int64
1 Legal Name 257624 non-null object
2 City 257624 non-null object
3 State 257624 non-null object
4 Zip 257624 non-null object
5 Country/Area 257624 non-null object
6 Fiscal Year 257624 non-null int64
7 Inspection ID 257624 non-null int64
8 Posted Citations 257624 non-null object
9 Inspection End Date 257624 non-null datetime64[ns]
10 Classification 257624 non-null object
11 Project Area 257624 non-null object
12 Product Type 257624 non-null object
13 FMD-145 Date 257624 non-null object
dtypes: datetime64[ns](1), int64(3), object(10)
memory usage: 27.5+ MB
## File2 DataFrame
df2.head(15)
| FEI Number | Legal Name | City | State | Zip | Country/Area | Fiscal Year | Inspection ID | Posted Citations | Inspection End Date | Classification | Project Area | Product Type | FMD-145 Date | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3017202975 | PAVmed Inc. | Foxboro | Massachusetts | 02035 | United States | 2022 | 1167299 | Yes | 2022-04-15 | Voluntary Action Indicated (VAI) | Compliance: Devices | Devices | - |
| 1 | 3017202975 | PAVmed Inc. | Foxboro | Massachusetts | 02035 | United States | 2022 | 1167299 | Yes | 2022-04-15 | No Action Indicated (NAI) | Postmarket Assurance: Devices | Devices | - |
| 2 | 3000138835 | Ten Ren Tea and Ginseng Co Inc | New York | New York | 10013 | United States | 2022 | 1167517 | No | 2022-04-15 | No Action Indicated (NAI) | Foodborne Biological Hazards | Food/Cosmetics | - |
| 3 | 3007086121 | Mass Commodities, LLC | Miami | Florida | 33131 | United States | 2022 | 1166850 | No | 2022-04-14 | No Action Indicated (NAI) | Foodborne Biological Hazards | Food/Cosmetics | - |
| 4 | 3020871464 | Blood Assurance inc | Hermitage | Tennessee | 37076 | United States | 2022 | 1167480 | No | 2022-04-14 | No Action Indicated (NAI) | Blood and Blood Products | Biologics | - |
| 5 | 3020871464 | Blood Assurance inc | Hermitage | Tennessee | 37076 | United States | 2022 | 1167480 | No | 2022-04-14 | No Action Indicated (NAI) | Vaccines and Allergenic Products | Biologics | - |
| 6 | 3001877793 | Marriott P.R. Management Corporation | San Juan | Puerto Rico | 00907 | United States | 2022 | 1166891 | No | 2022-04-13 | No Action Indicated (NAI) | Foodborne Biological Hazards | Food/Cosmetics | - |
| 7 | 3006463947 | Ximedica LLC | Providence | Rhode Island | 02907 | United States | 2022 | 1167002 | Yes | 2022-04-13 | Voluntary Action Indicated (VAI) | Compliance: Devices | Devices | - |
| 8 | 3006463947 | Ximedica LLC | Providence | Rhode Island | 02907 | United States | 2022 | 1167002 | Yes | 2022-04-13 | No Action Indicated (NAI) | Postmarket Assurance: Devices | Devices | - |
| 9 | 3006948344 | Snavely's Mill, Inc. | Palmyra | Pennsylvania | 17078 | United States | 2022 | 1167076 | No | 2022-04-13 | No Action Indicated (NAI) | Foodborne Biological Hazards | Food/Cosmetics | - |
| 10 | 3006948344 | Snavely's Mill, Inc. | Palmyra | Pennsylvania | 17078 | United States | 2022 | 1167076 | No | 2022-04-13 | No Action Indicated (NAI) | Monitoring of Marketed Animal Drugs, Feed, and... | Veterinary | - |
| 11 | 3004354270 | Ralphs/Food4Less Returns Processing Center | Ontario | California | 91761 | United States | 2022 | 1167100 | No | 2022-04-13 | No Action Indicated (NAI) | Monitoring of Marketed Animal Drugs, Feed, and... | Veterinary | - |
| 12 | 3014553208 | DELAVIUDA USA INC | Coral Gables | Florida | 33134 | United States | 2022 | 1166400 | No | 2022-04-12 | No Action Indicated (NAI) | Foodborne Biological Hazards | Food/Cosmetics | - |
| 13 | 3012386632 | OKE USA FRUIT COMPANY | West Bridgewater | Massachusetts | 02379 | United States | 2022 | 1166922 | No | 2022-04-12 | No Action Indicated (NAI) | Foodborne Biological Hazards | Food/Cosmetics | - |
| 14 | 3009492128 | Exogenesis Corporation | Billerica | Massachusetts | 01821 | United States | 2022 | 1167016 | Yes | 2022-04-12 | Voluntary Action Indicated (VAI) | Compliance: Devices | Devices | - |
## Removing Space from Columns Names
df1.columns = df1.columns.str.replace(' ', '')
df2.columns = df2.columns.str.replace(' ', '')
## Merging both DataFrames based on their common columns (FEI Number, Inspection ID, Inspection End Date, Legal Name)
merged_df = pd.merge(df1, df2, how="left", on=["FEINumber", "InspectionID", "InspectionEndDate", "LegalName"])
merged_df.columns = merged_df.columns.str.replace(' ', '')
merged_df
| InspectionID | FEINumber | LegalName | InspectionEndDate | ProgramArea | Act/CFRNumber | ShortDescription | LongDescription | City | State | Zip | Country/Area | FiscalYear | PostedCitations | Classification | ProjectArea | ProductType | FMD-145Date | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1167299 | 3017202975 | PAVmed Inc. | 2022-04-15 | Devices | 21 CFR 820.100(b) | Documentation | Corrective and preventive action activities an... | Foxboro | Massachusetts | 02035 | United States | 2022 | Yes | Voluntary Action Indicated (VAI) | Compliance: Devices | Devices | - |
| 1 | 1167299 | 3017202975 | PAVmed Inc. | 2022-04-15 | Devices | 21 CFR 820.100(b) | Documentation | Corrective and preventive action activities an... | Foxboro | Massachusetts | 02035 | United States | 2022 | Yes | No Action Indicated (NAI) | Postmarket Assurance: Devices | Devices | - |
| 2 | 1167299 | 3017202975 | PAVmed Inc. | 2022-04-15 | Devices | 21 CFR 820.198(a) | Lack of or inadequate complaint procedures | Procedures for receiving, reviewing, and evalu... | Foxboro | Massachusetts | 02035 | United States | 2022 | Yes | Voluntary Action Indicated (VAI) | Compliance: Devices | Devices | - |
| 3 | 1167299 | 3017202975 | PAVmed Inc. | 2022-04-15 | Devices | 21 CFR 820.198(a) | Lack of or inadequate complaint procedures | Procedures for receiving, reviewing, and evalu... | Foxboro | Massachusetts | 02035 | United States | 2022 | Yes | No Action Indicated (NAI) | Postmarket Assurance: Devices | Devices | - |
| 4 | 1167002 | 3006463947 | Ximedica LLC | 2022-04-13 | Devices | 21 CFR 820.30(g) | Design validation - production units | The design was not validated under defined ope... | Providence | Rhode Island | 02907 | United States | 2022 | Yes | Voluntary Action Indicated (VAI) | Compliance: Devices | Devices | - |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 313042 | 540313 | 3000203414 | Avanti Foods, Inc. | 2008-10-01 | Foods | 21 CFR 123.6(c)(4) | Monitoring - adequacy | Your HACCP plan lists monitoring procedures an... | Milwaukee | Wisconsin | 53214 | United States | 2009 | Yes | Voluntary Action Indicated (VAI) | Foodborne Biological Hazards | Food/Cosmetics | - |
| 313043 | 540313 | 3000203414 | Avanti Foods, Inc. | 2008-10-01 | Foods | 21 CFR 123.7(d) | Corrective action documentation | You do not have records that document correcti... | Milwaukee | Wisconsin | 53214 | United States | 2009 | Yes | Voluntary Action Indicated (VAI) | Foodborne Biological Hazards | Food/Cosmetics | - |
| 313044 | 541050 | 3003895195 | Little Lad's Bakery, Inc. | 2008-10-01 | Foods | 21 CFR 110.35(a) | Buildings/good repair | Failure to maintain physical facilities in rep... | Corinth | Maine | 04427 | United States | 2009 | Yes | Voluntary Action Indicated (VAI) | Foodborne Biological Hazards | Food/Cosmetics | - |
| 313045 | 541050 | 3003895195 | Little Lad's Bakery, Inc. | 2008-10-01 | Foods | 21 CFR 110.35(a) | Buildings/sanitary | Failure to maintain buildings, fixtures, or ot... | Corinth | Maine | 04427 | United States | 2009 | Yes | Voluntary Action Indicated (VAI) | Foodborne Biological Hazards | Food/Cosmetics | - |
| 313046 | 541050 | 3003895195 | Little Lad's Bakery, Inc. | 2008-10-01 | Foods | 21 CFR 110.35(c) | Lack of effective pest exclusion | Effective measures are not being taken to excl... | Corinth | Maine | 04427 | United States | 2009 | Yes | Voluntary Action Indicated (VAI) | Foodborne Biological Hazards | Food/Cosmetics | - |
313047 rows × 18 columns
## Printing Number of values in each dataframes
print(df1.isnull().sum(),"\n")
print(df2.isnull().sum(),"\n")
print(merged_df.isnull().sum())
InspectionID 0 FEINumber 0 LegalName 0 InspectionEndDate 0 ProgramArea 0 Act/CFRNumber 181 ShortDescription 0 LongDescription 3 dtype: int64 FEINumber 0 LegalName 0 City 0 State 0 Zip 0 Country/Area 0 FiscalYear 0 InspectionID 0 PostedCitations 0 InspectionEndDate 0 Classification 0 ProjectArea 0 ProductType 0 FMD-145Date 0 dtype: int64 InspectionID 0 FEINumber 0 LegalName 0 InspectionEndDate 0 ProgramArea 0 Act/CFRNumber 199 ShortDescription 0 LongDescription 3 City 0 State 0 Zip 0 Country/Area 0 FiscalYear 0 PostedCitations 0 Classification 0 ProjectArea 0 ProductType 0 FMD-145Date 0 dtype: int64
## Value counts of Legal Name
pd.DataFrame(merged_df['LegalName'].value_counts())
| LegalName | |
|---|---|
| Titan Medical Enterprises, Inc. DBA U.S. Apothecary Labs | 385 |
| Tropical Bakery Wholesale Corp | 215 |
| Invacare Corporation | 210 |
| Agropharma Laboratories Inc. | 202 |
| H & P Industries, Inc. | 198 |
| ... | ... |
| Phu Thinh Food Processing Export Joint Stock Company | 1 |
| Pigeon Feeds Inc. | 1 |
| Lenter, Fred | 1 |
| Dustin A. Bubar | 1 |
| Shantou Chaonan Liangying Hegao Plastic Toy Factory | 1 |
36615 rows × 1 columns
## Value counts of Program Area
pd.DataFrame(merged_df['ProgramArea'].value_counts())
| ProgramArea | |
|---|---|
| Foods | 168969 |
| Devices | 72912 |
| Drugs | 42906 |
| Veterinary Medicine | 8522 |
| Bioresearch Monitoring | 6613 |
| Human Tissue for Transplantation | 4643 |
| Biologics | 4606 |
| Parts 1240 and 1250 | 2707 |
| Radiologic Health | 940 |
| Part 11 Compliance | 229 |
## Value counts of Cities
pd.DataFrame(merged_df['City'].value_counts())
| City | |
|---|---|
| Miami | 4668 |
| Dallas | 2748 |
| Brooklyn | 2556 |
| Chicago | 2529 |
| Hialeah | 2343 |
| ... | ... |
| Kodama-Gun | 1 |
| A.Makham | 1 |
| Moorabbin | 1 |
| Arlington Hts | 1 |
| Spring Church | 1 |
9666 rows × 1 columns
## Value counts of State
pd.DataFrame(merged_df['State'].value_counts())
| State | |
|---|---|
| - | 41176 |
| California | 39319 |
| Florida | 31894 |
| New York | 21121 |
| Texas | 15142 |
| Ohio | 10438 |
| Illinois | 9634 |
| Pennsylvania | 8718 |
| New Jersey | 8654 |
| Massachusetts | 8355 |
| Washington | 8120 |
| Michigan | 7247 |
| Wisconsin | 6857 |
| Colorado | 6771 |
| North Carolina | 5909 |
| Puerto Rico | 5770 |
| Georgia | 5618 |
| Minnesota | 5119 |
| Utah | 4894 |
| Tennessee | 4353 |
| Indiana | 4171 |
| Missouri | 4100 |
| Virginia | 3980 |
| Maryland | 3954 |
| Kentucky | 3821 |
| Oregon | 3673 |
| Connecticut | 3080 |
| Arizona | 2815 |
| Maine | 2303 |
| South Carolina | 1943 |
| Kansas | 1860 |
| Louisiana | 1775 |
| Iowa | 1609 |
| Oklahoma | 1567 |
| Idaho | 1454 |
| New Mexico | 1449 |
| Rhode Island | 1446 |
| Alabama | 1419 |
| Nevada | 1405 |
| Hawaii | 1276 |
| New Hampshire | 1137 |
| Arkansas | 1129 |
| Vermont | 1113 |
| Mississippi | 792 |
| Nebraska | 771 |
| West Virginia | 685 |
| Montana | 659 |
| Alaska | 645 |
| District of Columbia | 418 |
| Delaware | 386 |
| North Dakota | 296 |
| Wyoming | 258 |
| South Dakota | 227 |
| Virgin Islands | 219 |
| Guam | 60 |
| American Samoa | 43 |
## Value counts of Country/Area
pd.DataFrame(merged_df['Country/Area'].value_counts())
| Country/Area | |
|---|---|
| United States | 271871 |
| India | 2803 |
| Canada | 2767 |
| Germany | 2353 |
| China | 2236 |
| ... | ... |
| Bosnia-Hercegovina | 2 |
| Egypt | 2 |
| Cape Verde | 2 |
| Faroe Islands | 2 |
| Tunisia | 2 |
107 rows × 1 columns
## Value counts of Classifications
pd.DataFrame(merged_df['Classification'].value_counts())
| Classification | |
|---|---|
| Voluntary Action Indicated (VAI) | 215608 |
| Official Action Indicated (OAI) | 54377 |
| No Action Indicated (NAI) | 43062 |
## Value counts of Project Areas
pd.DataFrame(merged_df['ProjectArea'].value_counts())
| ProjectArea | |
|---|---|
| Foodborne Biological Hazards | 111517 |
| Compliance: Devices | 41372 |
| Food Composition, Standards, Labeling and Econ | 35218 |
| Drug Quality Assurance | 30502 |
| Postmarket Assurance: Devices | 28573 |
| Monitoring of Marketed Animal Drugs, Feed, and Devices | 15017 |
| Bioresearch Monitoring | 9921 |
| Pesticides and Chemical Contaminants | 9420 |
| Food and Color Additives Petition Review | 8377 |
| Blood and Blood Products | 5653 |
| Human Cellular, Tissue, and Gene Therapies | 4609 |
| Technical Assistance: Food and Cosmetics | 4468 |
| Colors and Cosmetics Technology | 3020 |
| Unapproved and Misbranded Drugs | 2658 |
| Radiation Control and Health Safety Act | 1758 |
| Pre-Approval Eval. of Animal Drugs and Food Additives | 378 |
| Vaccines and Allergenic Products | 299 |
| Postmarket Surv. and Epidemiology | 161 |
| Over-the-Counter Drug Evaluation | 87 |
| Molecular Biology and Natural Toxins | 28 |
| Prescription Drug Advertising and Labelling | 8 |
| Tobacco Post-Market Activities | 3 |
## Value counts of Product Type
pd.DataFrame(merged_df['ProductType'].value_counts())
| ProductType | |
|---|---|
| Food/Cosmetics | 172054 |
| Devices | 75094 |
| Drugs | 39138 |
| Veterinary | 15427 |
| Biologics | 11331 |
| Tobacco | 3 |
## Reference - https://stackoverflow.com/questions/38783027/jupyter-notebook-display-two-pandas-tables-side-by-side
from IPython.display import display_html
from itertools import chain,cycle
def display_side_by_side(*args,titles=cycle([''])):
html_str=''
for df,title in zip(args, chain(titles,cycle(['</br>'])) ):
html_str+='<th style="text-align:center"><td style="vertical-align:top">'
html_str+=f'<h2>{title}</h2>'
html_str+=df.to_html().replace('table','table style="display:inline"')
html_str+='</td></th>'
display_html(html_str,raw=True)
## Value counts of Short Description & Long Description
short_df = pd.DataFrame(merged_df['ShortDescription'].value_counts().head(10))
long_df = pd.DataFrame(merged_df['LongDescription'].value_counts().head(10))
display_side_by_side(short_df,long_df, titles=['Short Description','Long Description'])
| ShortDescription | |
|---|---|
| Lack of or inadequate procedures | 6099 |
| Sanitation monitoring | 5124 |
| Lack of effective pest exclusion | 4967 |
| Lack of or inadequate complaint procedures | 4341 |
| Screening | 3293 |
| Documentation | 3287 |
| HACCP plan implementation | 3037 |
| Floors, walls and ceilings | 2742 |
| Lack of Written MDR Procedures | 2622 |
| Buildings/sanitary | 2537 |
| LongDescription | |
|---|---|
| Procedures for corrective and preventive action have not been adequately established. | 4537 |
| Procedures for receiving, reviewing, and evaluating complaints by a formally designated unit have not been adequately established. | 3311 |
| Failure to provide adequate screening or other protection against pests. | 3293 |
| Failure to maintain buildings, fixtures, or other physical facilities in a sanitary condition. | 2537 |
| You did not develop an FSVP. | 2237 |
| Effective measures are not being taken to exclude pests from the processing areas. | 2220 |
| Your HACCP plan does not list the food safety hazards that are reasonably likely to occur. | 2214 |
| Your HACCP plan does not list one or more critical control points that are necessary for each of the identified food safety hazards. | 2105 |
| You do not have a written HACCP plan that outlines controls for a food safety hazard that is reasonably likely to occur. | 1866 |
| All reasonable precautions are not taken to ensure that production procedures do not contribute contamination from any source. | 1839 |
## Product Type wise Short Description - Maryland State
byProductType = pd.DataFrame(merged_df.groupby('State').ShortDescription.value_counts())
pd.DataFrame(byProductType.unstack().T['Maryland'].sort_values(ascending=False).head(15))
| Maryland | ||
|---|---|---|
| ShortDescription | ||
| ShortDescription | Lack of effective pest exclusion | 115.0 |
| Sanitation monitoring | 86.0 | |
| Lack of or inadequate procedures | 66.0 | |
| Precautions against contamination--micro, foreign substances | 60.0 | |
| Screening | 59.0 | |
| Failure to wear | 59.0 | |
| HACCP plan implementation | 47.0 | |
| Harborage areas | 45.0 | |
| Not washed/sanitized when appropriate | 44.0 | |
| Documentation | 39.0 | |
| Floors, walls and ceilings | 37.0 | |
| FD-1572, protocol compliance | 36.0 | |
| Personnel | 34.0 | |
| Cleaning and sanitizing operations | 32.0 | |
| Buildings/good repair | 32.0 |
## Product Type wise Long Description - Maryland State
byProductType = pd.DataFrame(merged_df.groupby('State').LongDescription.value_counts())
pd.DataFrame(byProductType.unstack().T['Maryland'].sort_values(ascending=False).head(15))
| Maryland | ||
|---|---|---|
| LongDescription | ||
| LongDescription | Failure to provide adequate screening or other protection against pests. | 59.0 |
| Effective measures are not being taken to exclude pests from the processing areas. | 55.0 | |
| Procedures for corrective and preventive action have not been adequately established. | 54.0 | |
| Effective measures are not being taken to exclude pests from the processing areas and protect against the contamination of food on the premises by pests. | 31.0 | |
| Failure to maintain buildings, fixtures, or other physical facilities in a sanitary condition. | 30.0 | |
| You do not have a written HACCP plan that outlines controls for a food safety hazard that is reasonably likely to occur. | 29.0 | |
| Your HACCP plan does not list one or more critical control points that are necessary for each of the identified food safety hazards. | 28.0 | |
| You did not take a reasonable measure and precaution related to personnel practices. | 27.0 | |
| Failure to wear beard covers where appropriate. | 26.0 | |
| You did not exclude pests from your food plant to protect against contamination of food. | 26.0 | |
| No one associated with your firm has completed the required HACCP training or is HACCP qualified through job experience. | 25.0 | |
| Effective measures are not being taken to protect against the contamination of food on the premises by pests. | 22.0 | |
| An investigation was not conducted in accordance with the investigational plan. | 22.0 | |
| Procedures for design control have not been established. | 21.0 | |
| All reasonable precautions are not taken to ensure that production procedures do not contribute contamination from any source. | 20.0 |
## Product Type wise Short Description - Biologics
byProductType = pd.DataFrame(merged_df.groupby('ProductType').ShortDescription.value_counts())
byProductType.unstack().T.sort_values(by=['Biologics'], ascending=False).head(15)
| ProductType | Biologics | Devices | Drugs | Food/Cosmetics | Tobacco | Veterinary | |
|---|---|---|---|---|---|---|---|
| ShortDescription | |||||||
| ShortDescription | Establish, maintain and follow manufacturing SOPs | 985.0 | 1.0 | 1.0 | NaN | NaN | NaN |
| Risk factors, clinical evidence | 365.0 | 1.0 | 1.0 | NaN | NaN | NaN | |
| Thorough investigations | 327.0 | 2.0 | 4.0 | NaN | NaN | NaN | |
| Testing, screening, donor eligibility procedures | 273.0 | 2.0 | NaN | NaN | NaN | NaN | |
| Concurrent documentation | 232.0 | 3.0 | 2.0 | NaN | NaN | NaN | |
| Procedures for all other requirements | 220.0 | 1.0 | NaN | NaN | NaN | NaN | |
| Required records | 213.0 | 1.0 | 1.0 | NaN | NaN | NaN | |
| Person performing, test results, interpretation | 208.0 | 1.0 | NaN | NaN | NaN | NaN | |
| FD-1572, protocol compliance | 204.0 | 9.0 | 1339.0 | 1.0 | NaN | 3.0 | |
| Biological product deviation report | 203.0 | 4.0 | 5.0 | NaN | NaN | NaN | |
| Procedures to meet core CTGP | 186.0 | 8.0 | NaN | NaN | NaN | NaN | |
| Determination based on screening and testing | 181.0 | 2.0 | NaN | NaN | NaN | NaN | |
| Case history records- inadequate or inadequate | 140.0 | 5.0 | 796.0 | NaN | NaN | 3.0 | |
| Responsible person to determine, document | 132.0 | NaN | NaN | NaN | NaN | NaN | |
| Accurate, indelible, legible | 131.0 | NaN | NaN | NaN | NaN | NaN |
## Product Type wise Long Description - Biologics
byClassification = pd.DataFrame(merged_df.groupby('ProductType').LongDescription.value_counts())
byClassification.unstack().T.sort_values(by=['Biologics'], ascending=False).head(15)
| ProductType | Biologics | Devices | Drugs | Food/Cosmetics | Tobacco | Veterinary | |
|---|---|---|---|---|---|---|---|
| LongDescription | |||||||
| LongDescription | Donors were not screened by a review of relevant medical records for risk factors of communicable disease agents and diseases. | 197.0 | NaN | 1.0 | NaN | NaN | NaN |
| HCT/P donors were not determined to be eligible based on the results of donor screening and testing. | 181.0 | 2.0 | NaN | NaN | NaN | NaN | |
| Failure to submit a biological product deviation report within 45 days from the date you acquired information suggesting that a reportable event occurred. | 143.0 | 3.0 | 5.0 | NaN | NaN | NaN | |
| Donors were not screened by a review of relevant medical records for risk factors and clinical evidence of communicable disease agents and diseases. | 136.0 | 1.0 | NaN | NaN | NaN | NaN | |
| Failure to perform a thorough investigation and make a record of the conclusions and follow-up of an unexplained discrepancy. | 129.0 | 1.0 | NaN | NaN | NaN | NaN | |
| Failure to use supplies and reagents in a manner consistent with instructions provided by the manufacturer. | 110.0 | NaN | NaN | NaN | NaN | NaN | |
| An investigation was not conducted in accordance with the investigational plan. | 104.0 | 101.0 | 569.0 | NaN | NaN | 2.0 | |
| Donor specimens used for testing of communicable disease agents were not collected at the appropriate time. | 95.0 | NaN | NaN | NaN | NaN | NaN | |
| An investigation was not conducted in accordance with the signed statement of investigator and investigational plan. | 88.0 | 3.0 | 645.0 | 1.0 | NaN | NaN | |
| Storage temperatures of HCT/Ps were not recorded and maintained. | 86.0 | 2.0 | NaN | NaN | NaN | NaN | |
| The eligibility of an HCT/P donor was not determined and documented by a responsible person, based on results of donor screening and donor testing. | 84.0 | NaN | NaN | NaN | NaN | NaN | |
| All records pertinent to a lot or unit were not reviewed before the release or distribution of a lot or unit of final product. | 79.0 | NaN | 1.0 | NaN | NaN | NaN | |
| Donors were not tested for evidence of infection with relevant communicable disease agents. | 75.0 | NaN | NaN | NaN | NaN | NaN | |
| Failure to perform a thorough investigation of an unexplained discrepancy. | 69.0 | NaN | 2.0 | NaN | NaN | NaN | |
| HCT/Ps for which the donor eligibility determination was not performed were not prominently labeled with the appropriate warning statements. | 68.0 | NaN | NaN | NaN | NaN | NaN |
## Product Type wise Short Description - Devices
byClassification = pd.DataFrame(merged_df.groupby('ProductType').ShortDescription.value_counts())
byClassification.unstack().T.sort_values(by=['Devices'], ascending=False).head(15)
| ProductType | Biologics | Devices | Drugs | Food/Cosmetics | Tobacco | Veterinary | |
|---|---|---|---|---|---|---|---|
| ShortDescription | |||||||
| ShortDescription | Lack of or inadequate procedures | 31.0 | 5995.0 | 50.0 | 20.0 | NaN | 3.0 |
| Lack of or inadequate complaint procedures | 20.0 | 4284.0 | 22.0 | 15.0 | NaN | NaN | |
| Documentation | 57.0 | 3171.0 | 46.0 | 12.0 | NaN | 1.0 | |
| Lack of Written MDR Procedures | 22.0 | 2562.0 | 25.0 | 12.0 | NaN | 1.0 | |
| Purchasing controls, Lack of or inadequate procedures | 11.0 | 2298.0 | 24.0 | 11.0 | NaN | NaN | |
| Lack of or inadequate process validation | 10.0 | 2025.0 | 26.0 | 11.0 | NaN | 3.0 | |
| Nonconforming product, Lack of or inadequate procedures | 8.0 | 1833.0 | 14.0 | 4.0 | NaN | NaN | |
| Design changes - Lack of or Inadequate Procedures | 7.0 | 1478.0 | 11.0 | 2.0 | NaN | NaN | |
| Quality audits - Lack of or inadequate procedures | 5.0 | 1451.0 | 13.0 | 2.0 | NaN | 1.0 | |
| DMR - not or inadequately maintained | 7.0 | 1175.0 | 10.0 | 4.0 | NaN | NaN | |
| Investigation of device failures | 4.0 | 1133.0 | 15.0 | 2.0 | NaN | 1.0 | |
| Training - Lack of or inadequate procedures | 11.0 | 1010.0 | 10.0 | 2.0 | NaN | NaN | |
| Design control - no procedures | 7.0 | 977.0 | 15.0 | 3.0 | NaN | NaN | |
| Lack of or inadequate final acceptance procedures | 9.0 | 964.0 | 14.0 | 6.0 | NaN | 3.0 | |
| Calibration, Inspection, etc. Procedures Lack of or Inadequ | 2.0 | 901.0 | 6.0 | NaN | NaN | 1.0 |
## Product Type wise Long Description - Devices
byClassification = pd.DataFrame(merged_df.groupby('ProductType').LongDescription.value_counts())
byClassification.unstack().T.sort_values(by=['Devices'], ascending=False).head(15)
| ProductType | Biologics | Devices | Drugs | Food/Cosmetics | Tobacco | Veterinary | |
|---|---|---|---|---|---|---|---|
| LongDescription | |||||||
| LongDescription | Procedures for corrective and preventive action have not been adequately established. | 18.0 | 4475.0 | 34.0 | 7.0 | NaN | 3.0 |
| Procedures for receiving, reviewing, and evaluating complaints by a formally designated unit have not been adequately established. | 15.0 | 3275.0 | 15.0 | 6.0 | NaN | NaN | |
| Procedures to ensure that all purchased or otherwise received product and services conform to specified requirements have not been adequately established. | 6.0 | 1481.0 | 11.0 | 3.0 | NaN | NaN | |
| Procedures have not been adequately established to control product that does not conform to specified requirements. | 5.0 | 1286.0 | 11.0 | 4.0 | NaN | NaN | |
| Corrective and preventive action activities and/or results have not been adequately documented. | 20.0 | 1274.0 | 21.0 | 7.0 | NaN | 1.0 | |
| Procedures for corrective and preventive action have not been established. | 12.0 | 1260.0 | 13.0 | 11.0 | NaN | NaN | |
| A process whose results cannot be fully verified by subsequent inspection and test has not been adequately validated according to established procedures. | 7.0 | 1169.0 | 14.0 | 5.0 | NaN | 2.0 | |
| Written MDR procedures have not been developed. | 4.0 | 1099.0 | 7.0 | 2.0 | NaN | NaN | |
| Procedures for design change have not been adequately established. | 3.0 | 1052.0 | 4.0 | 1.0 | NaN | NaN | |
| Procedures for receiving, reviewing, and evaluating complaints by a formally designated unit have not been established. | 5.0 | 1007.0 | 7.0 | 9.0 | NaN | NaN | |
| Procedures for quality audits have not been adequately established. | 2.0 | 841.0 | 7.0 | 1.0 | NaN | NaN | |
| Procedures for design control have not been established. | 6.0 | 820.0 | 12.0 | 2.0 | NaN | NaN | |
| An MDR report was not submitted within 30 days of receiving or otherwise becoming aware of information that reasonably suggests that a marketed device has malfunctioned and would be likely to cause or contribute to a death or serious injury if the malfunction were to recur. | 13.0 | 815.0 | 3.0 | NaN | NaN | 1.0 | |
| Procedures to ensure that all purchased or otherwise received product and services conform to specified requirements have not been established. | 2.0 | 680.0 | 11.0 | 7.0 | NaN | NaN | |
| Procedures for training and identifying training needs have not been adequately established. | 9.0 | 673.0 | 6.0 | 1.0 | NaN | NaN |
## Product Type wise Short Description - Drugs
byClassification = pd.DataFrame(merged_df.groupby('ProductType').ShortDescription.value_counts())
byClassification.unstack().T.sort_values(by=['Drugs'], ascending=False).head(15)
| ProductType | Biologics | Devices | Drugs | Food/Cosmetics | Tobacco | Veterinary | |
|---|---|---|---|---|---|---|---|
| ShortDescription | |||||||
| ShortDescription | Procedures not in writing, fully followed | 65.0 | 61.0 | 1587.0 | 197.0 | NaN | 222.0 |
| FD-1572, protocol compliance | 204.0 | 9.0 | 1339.0 | 1.0 | NaN | 3.0 | |
| Scientifically sound laboratory controls | 32.0 | 43.0 | 994.0 | 114.0 | NaN | 148.0 | |
| Investigations of discrepancies, failures | 32.0 | 58.0 | 973.0 | 90.0 | NaN | 151.0 | |
| Absence of Written Procedures | 27.0 | 46.0 | 891.0 | 168.0 | NaN | 149.0 | |
| Case history records- inadequate or inadequate | 140.0 | 5.0 | 796.0 | NaN | NaN | 3.0 | |
| Testing and release for distribution | 4.0 | 16.0 | 669.0 | 73.0 | NaN | 117.0 | |
| Written procedures not established/followed | 6.0 | 19.0 | 640.0 | 97.0 | NaN | 103.0 | |
| Cleaning / Sanitizing / Maintenance | 4.0 | 26.0 | 632.0 | 73.0 | NaN | 101.0 | |
| Control procedures to monitor and validate performance | 13.0 | 37.0 | 612.0 | 90.0 | NaN | 85.0 | |
| Calibration/Inspection/Checking not done | 38.0 | 26.0 | 528.0 | 61.0 | NaN | 77.0 | |
| Lack of written stability program | 14.0 | 23.0 | 527.0 | 84.0 | NaN | 104.0 | |
| Training--operations, GMPs, written procedures | 15.0 | 17.0 | 522.0 | 91.0 | NaN | 74.0 | |
| SOPs not followed / documented | 65.0 | 29.0 | 486.0 | 35.0 | NaN | 86.0 | |
| Prepared for each batch, include complete information | 10.0 | 18.0 | 460.0 | 77.0 | NaN | 79.0 |
## Product Type wise Long Description - Drugs
byClassification = pd.DataFrame(merged_df.groupby('ProductType').LongDescription.value_counts())
byClassification.unstack().T.sort_values(by=['Drugs'], ascending=False).head(15)
| ProductType | Biologics | Devices | Drugs | Food/Cosmetics | Tobacco | Veterinary | |
|---|---|---|---|---|---|---|---|
| LongDescription | |||||||
| LongDescription | There are no written procedures for production and process controls designed to assure that the drug products have the identity, strength, quality, and purity they purport or are represented to possess. | 27.0 | 46.0 | 891.0 | 168.0 | NaN | 149.0 |
| The responsibilities and procedures applicable to the quality control unit are not fully followed. | 43.0 | 32.0 | 656.0 | 59.0 | NaN | 92.0 | |
| An investigation was not conducted in accordance with the signed statement of investigator and investigational plan. | 88.0 | 3.0 | 645.0 | 1.0 | NaN | NaN | |
| The responsibilities and procedures applicable to the quality control unit are not in writing and fully followed. | 14.0 | 18.0 | 570.0 | 71.0 | NaN | 73.0 | |
| An investigation was not conducted in accordance with the investigational plan. | 104.0 | 101.0 | 569.0 | NaN | NaN | 2.0 | |
| There is no written testing program designed to assess the stability characteristics of drug products. | 14.0 | 23.0 | 527.0 | 84.0 | NaN | 104.0 | |
| There is a failure to thoroughly review any unexplained discrepancy and the failure of a batch or any of its components to meet any of its specifications whether or not the batch has been already distributed. | 11.0 | 32.0 | 453.0 | 47.0 | NaN | 71.0 | |
| There is no quality control unit. | 16.0 | 14.0 | 357.0 | 86.0 | NaN | 64.0 | |
| Control procedures are not established which validate the performance of those manufacturing processes that may be responsible for causing variability in the characteristics of in-process material and the drug product. | 11.0 | 19.0 | 340.0 | 64.0 | NaN | 55.0 | |
| There is a failure to thoroughly review any unexplained discrepancy whether or not the batch has been already distributed. | 14.0 | 12.0 | 334.0 | 29.0 | NaN | 46.0 | |
| The responsibilities and procedures applicable to the quality control unit are not in writing. | 7.0 | 11.0 | 329.0 | 65.0 | NaN | 52.0 | |
| Batch production and control records do not include complete information relating to the production and control of each batch. | 5.0 | 13.0 | 313.0 | 51.0 | NaN | 58.0 | |
| Failure to prepare or maintain adequate and accurate case histories with respect to observations and data pertinent to the investigation. | 52.0 | 2.0 | 311.0 | NaN | NaN | 1.0 | |
| Written production and process control procedures are not followed in the execution of production and process control functions. | 35.0 | 16.0 | 306.0 | 18.0 | NaN | 57.0 | |
| Written procedures are not established for the cleaning and maintenance of equipment, including utensils, used in the manufacture, processing, packing or holding of a drug product. | 5.0 | 9.0 | 296.0 | 52.0 | NaN | 45.0 |
## Product Type wise Short Description - Veterinary
byClassification = pd.DataFrame(merged_df.groupby('ProductType').ShortDescription.value_counts())
byClassification.unstack().T.sort_values(by=['Veterinary'], ascending=False).head(15)
| ProductType | Biologics | Devices | Drugs | Food/Cosmetics | Tobacco | Veterinary | |
|---|---|---|---|---|---|---|---|
| ShortDescription | |||||||
| ShortDescription | Record keeping | NaN | NaN | NaN | 5.0 | NaN | 1700.0 |
| ELDU - producer causes Illegal drug residue | NaN | NaN | NaN | NaN | NaN | 1185.0 | |
| Drug inventory (control of drugs on premises) | NaN | NaN | NaN | 4.0 | NaN | 563.0 | |
| Expired drugs administered | NaN | NaN | NaN | 2.0 | NaN | 338.0 | |
| Individual identity of animal(s) | NaN | NaN | NaN | NaN | NaN | 334.0 | |
| ELDU without veterinary client-patient relationship | NaN | NaN | NaN | 1.0 | NaN | 282.0 | |
| Prescribed ELDU - vet causes an illegal drug residue | NaN | NaN | NaN | NaN | NaN | 260.0 | |
| Prescribed directions - not followed | NaN | NaN | NaN | NaN | NaN | 250.0 | |
| Procedures not in writing, fully followed | 65.0 | 61.0 | 1587.0 | 197.0 | NaN | 222.0 | |
| Records review prior to shipment for slaughter | NaN | NaN | NaN | NaN | NaN | 219.0 | |
| Drugs prohibited for extralabel use in food-producing animal(s) | NaN | NaN | NaN | 2.0 | NaN | 219.0 | |
| System for administration of drugs | NaN | NaN | NaN | 1.0 | NaN | 173.0 | |
| Dosage level | NaN | NaN | NaN | NaN | NaN | 156.0 | |
| Investigations of discrepancies, failures | 32.0 | 58.0 | 973.0 | 90.0 | NaN | 151.0 | |
| Absence of Written Procedures | 27.0 | 46.0 | 891.0 | 168.0 | NaN | 149.0 |
## Product Type wise Long Description - Veterinary
byClassification = pd.DataFrame(merged_df.groupby('ProductType').LongDescription.value_counts())
byClassification.unstack().T.sort_values(by=['Veterinary'], ascending=False).head(15)
| ProductType | Biologics | Devices | Drugs | Food/Cosmetics | Tobacco | Veterinary | |
|---|---|---|---|---|---|---|---|
| LongDescription | |||||||
| LongDescription | Causing a residue of an approved human or animal drug above an established safe level, safe concentration, or tolerance, through use of the drug contrary to its labeling. | NaN | NaN | NaN | NaN | NaN | 1180.0 |
| Treatment records were not complete. | NaN | NaN | NaN | 2.0 | NaN | 710.0 | |
| Treatment records were not maintained. | NaN | NaN | NaN | 2.0 | NaN | 562.0 | |
| Treatment records were not maintained and complete. | NaN | NaN | NaN | 1.0 | NaN | 359.0 | |
| Expired drug(s) were observed in the drug storage area. | NaN | NaN | NaN | 2.0 | NaN | 338.0 | |
| Use of an animal drug in a manner contrary to label directions without benefit of a valid veterinary client-patient relationship. | NaN | NaN | NaN | 1.0 | NaN | 264.0 | |
| You lack an adequate inventory system for determining the quantities of drugs used to medicate your livestock. | NaN | NaN | NaN | NaN | NaN | 228.0 | |
| Failure to systematically review treatment records prior to offering an animal for slaughter for human food, to assure that drugs have been used only as directed and that appropriate withdrawal times have been observed. | NaN | NaN | NaN | NaN | NaN | 217.0 | |
| Failure to have a system to control administration of drug treatments to your animals. | NaN | NaN | NaN | 1.0 | NaN | 171.0 | |
| You lack an adequate inventory system for determining the quantities of drugs used to medicate your cows. | NaN | NaN | NaN | 2.0 | NaN | 159.0 | |
| Administration of an approved animal drug in excess of the indicated dosage, without benefit of a valid veterinarian-client-patient relationship. | NaN | NaN | NaN | NaN | NaN | 156.0 | |
| There are no written procedures for production and process controls designed to assure that the drug products have the identity, strength, quality, and purity they purport or are represented to possess. | 27.0 | 46.0 | 891.0 | 168.0 | NaN | 149.0 | |
| Administration of an approved animal drug contrary to the labeling, without benefit of a valid veterinarian-client-patient relationship, in that pre-slaughter withdrawal time was not observed. | NaN | NaN | NaN | 1.0 | NaN | 139.0 | |
| A prohibited drug was administered in an extralabel manner to a food-producing animal(s). | NaN | NaN | NaN | NaN | NaN | 135.0 | |
| You lack an adequate inventory system for determining the quantities of drugs used to medicate your cows and calves. | NaN | NaN | NaN | NaN | NaN | 108.0 |
## Product Type wise Short Description - Food/Cosmetics
byClassification = pd.DataFrame(merged_df.groupby('ProductType').ShortDescription.value_counts())
byClassification.unstack().T.sort_values(by=['Food/Cosmetics'], ascending=False).head(15)
| ProductType | Biologics | Devices | Drugs | Food/Cosmetics | Tobacco | Veterinary | |
|---|---|---|---|---|---|---|---|
| ShortDescription | |||||||
| ShortDescription | Sanitation monitoring | NaN | NaN | 4.0 | 5073.0 | NaN | 47.0 |
| Lack of effective pest exclusion | NaN | 3.0 | 10.0 | 4854.0 | NaN | 100.0 | |
| Screening | NaN | 3.0 | 8.0 | 3230.0 | NaN | 52.0 | |
| HACCP plan implementation | NaN | NaN | 4.0 | 3020.0 | NaN | 13.0 | |
| Floors, walls and ceilings | NaN | NaN | 2.0 | 2701.0 | NaN | 39.0 | |
| Buildings/sanitary | NaN | NaN | 3.0 | 2499.0 | NaN | 35.0 | |
| Food safety hazards | NaN | NaN | 1.0 | 2203.0 | NaN | 10.0 | |
| Develop FSVP | NaN | NaN | NaN | 2166.0 | NaN | 71.0 | |
| Critical control points | NaN | NaN | 3.0 | 2092.0 | NaN | 10.0 | |
| Critical limits | NaN | NaN | NaN | 2073.0 | NaN | 8.0 | |
| No HACCP plan | NaN | NaN | 2.0 | 2072.0 | NaN | 7.0 | |
| Manufacturing conditions | NaN | 1.0 | 4.0 | 2043.0 | NaN | 28.0 | |
| Buildings/good repair | NaN | NaN | 3.0 | 1962.0 | NaN | 43.0 | |
| Harborage areas | NaN | NaN | 4.0 | 1960.0 | 1.0 | 39.0 | |
| Monitoring - adequacy | NaN | NaN | NaN | 1932.0 | NaN | 9.0 |
## Bar-graph Showing biggest problems found during Food/Cosmetic Inspection
byClassification.unstack().T['Food/Cosmetics'].sort_values(ascending=False).head(15).plot(kind='bar')
<AxesSubplot:xlabel='None,ShortDescription'>
## Product Type wise Long Description - Food/Cosmetics
byClassification = pd.DataFrame(merged_df.groupby('ProductType').LongDescription.value_counts())
byClassification.unstack().T.sort_values(by=['Food/Cosmetics'], ascending=False).head(15)
| ProductType | Biologics | Devices | Drugs | Food/Cosmetics | Tobacco | Veterinary | |
|---|---|---|---|---|---|---|---|
| LongDescription | |||||||
| LongDescription | Failure to provide adequate screening or other protection against pests. | NaN | 3.0 | 8.0 | 3230.0 | NaN | 52.0 |
| Failure to maintain buildings, fixtures, or other physical facilities in a sanitary condition. | NaN | NaN | 3.0 | 2499.0 | NaN | 35.0 | |
| Your HACCP plan does not list the food safety hazards that are reasonably likely to occur. | NaN | NaN | 1.0 | 2203.0 | NaN | 10.0 | |
| Effective measures are not being taken to exclude pests from the processing areas. | NaN | 3.0 | 3.0 | 2179.0 | NaN | 35.0 | |
| You did not develop an FSVP. | NaN | NaN | NaN | 2166.0 | NaN | 71.0 | |
| Your HACCP plan does not list one or more critical control points that are necessary for each of the identified food safety hazards. | NaN | NaN | 3.0 | 2092.0 | NaN | 10.0 | |
| You do not have a written HACCP plan that outlines controls for a food safety hazard that is reasonably likely to occur. | NaN | NaN | 2.0 | 1859.0 | NaN | 5.0 | |
| All reasonable precautions are not taken to ensure that production procedures do not contribute contamination from any source. | NaN | 1.0 | NaN | 1796.0 | NaN | 42.0 | |
| Effective measures are not being taken to exclude pests from the processing areas and protect against the contamination of food on the premises by pests. | NaN | NaN | NaN | 1644.0 | NaN | 35.0 | |
| Your HACCP plan lists a critical limit that does not ensure control of one or more hazards. | NaN | NaN | NaN | 1593.0 | NaN | 6.0 | |
| Hand-washing facilities lack running water of a suitable temperature. | NaN | 1.0 | 4.0 | 1512.0 | NaN | 24.0 | |
| Failure to store raw materials in a manner that protects against contamination. | NaN | 2.0 | NaN | 1279.0 | NaN | 14.0 | |
| Your HACCP plan includes a corrective action plan that is not in accordance with 21 CFR 123.7(b) to ensure the cause of the deviation was corrected. | NaN | NaN | NaN | 943.0 | NaN | 5.0 | |
| Effective measures are not being taken to protect against the contamination of food on the premises by pests. | NaN | NaN | 7.0 | 929.0 | NaN | 24.0 | |
| No one associated with your firm has completed the required HACCP training or is HACCP qualified through job experience. | NaN | NaN | 1.0 | 877.0 | NaN | 4.0 |
## Product Type wise Short Description - Tobacco
byClassification = pd.DataFrame(merged_df.groupby('ProductType').ShortDescription.value_counts())
byClassification.unstack().T.sort_values(by=['Tobacco'], ascending=False).head(15)
| ProductType | Biologics | Devices | Drugs | Food/Cosmetics | Tobacco | Veterinary | |
|---|---|---|---|---|---|---|---|
| ShortDescription | |||||||
| ShortDescription | Harborage areas | NaN | NaN | 4.0 | 1960.0 | 1.0 | 39.0 |
| Storage requirements | NaN | NaN | 1.0 | 1283.0 | 1.0 | 22.0 | |
| Sufficient space | NaN | 1.0 | 1.0 | 1139.0 | 1.0 | 9.0 | |
| "Use ONLY with timer" label | NaN | 3.0 | NaN | NaN | NaN | NaN | |
| 15 day reporting timeframe | 5.0 | 2.0 | NaN | NaN | NaN | NaN | |
| 15 minute interval (S) | NaN | NaN | NaN | 9.0 | NaN | NaN | |
| 24 hour retrieval of offsite records | NaN | NaN | NaN | 5.0 | NaN | NaN | |
| 30-day sponsor notification to FDA | NaN | 1.0 | NaN | NaN | NaN | NaN | |
| 50 mm classification warning | NaN | 6.0 | NaN | NaN | NaN | NaN | |
| 60 day updates | NaN | NaN | NaN | 2.0 | NaN | NaN | |
| >50% of members not at RDRC meeting | NaN | NaN | 2.0 | NaN | NaN | NaN | |
| Abbreviated procedure | 39.0 | NaN | NaN | NaN | NaN | NaN | |
| Abbreviated requirements / non-significant risk study | 1.0 | 10.0 | NaN | NaN | NaN | NaN | |
| Absence of Written Procedures | 27.0 | 46.0 | 891.0 | 168.0 | NaN | 149.0 | |
| Accelerated stability studies | NaN | 1.0 | 25.0 | 5.0 | NaN | NaN |
## Product Type wise Long Description - Tobacco
byClassification = pd.DataFrame(merged_df.groupby('ProductType').LongDescription.value_counts())
byClassification.unstack().T.sort_values(by=['Tobacco'], ascending=False).head(15)
| ProductType | Biologics | Devices | Drugs | Food/Cosmetics | Tobacco | Veterinary | |
|---|---|---|---|---|---|---|---|
| LongDescription | |||||||
| LongDescription | Failure to properly store toxic cleaning compounds in a manner that protects against contamination of food. | NaN | NaN | 1.0 | 98.0 | 1.0 | 3.0 |
| Failure to provide sufficient space for storage of materials as necessary for the maintenance of sanitary operations and the production of safe food. | NaN | 1.0 | NaN | 732.0 | 1.0 | 7.0 | |
| Failure to remove litter and waste and cut weeds or grass that may constitute an attractant, breeding place, or harborage area for pests, within the immediate vicinity of the plant buildings or structures. | NaN | NaN | 1.0 | 202.0 | 1.0 | 2.0 | |
| A Master Record File providing the complete procedure for manufacturing a specific product is not checked , dated and signed or initialed by a qualified person. | NaN | NaN | NaN | NaN | NaN | 1.0 | |
| A Master Record File providing the complete procedure for manufacturing a specific product is not dated and signed or initialed by a qualified person. | NaN | NaN | NaN | NaN | NaN | 1.0 | |
| A Master Record File providing the complete procedure for manufacturing a specific product is not signed or initialed by a qualified person. | NaN | NaN | NaN | NaN | NaN | 2.0 | |
| A justification for not reporting the correction or removal action to FDA that included by a designated person was not included in the record. | NaN | 4.0 | NaN | NaN | NaN | NaN | |
| A justification for not reporting the correction or removal action to FDA that included conclusions , follow-ups and reviews by a designated person was not included in the record. | NaN | 1.0 | NaN | NaN | NaN | NaN | |
| A justification for not reporting the correction or removal action to FDA that included conclusions and follow-ups and reviews by a designated person was not included in the record. | 1.0 | 2.0 | NaN | NaN | NaN | NaN | |
| A justification for not reporting the correction or removal action to FDA that included conclusions and reviews by a designated person was not included in the record. | NaN | 9.0 | NaN | NaN | NaN | NaN | |
| A justification for not reporting the correction or removal action to FDA that included conclusions by a designated person was not included in the record. | NaN | 9.0 | NaN | NaN | NaN | NaN | |
| A justification for not reporting the correction or removal action to FDA that included conclusions, follow-ups and reviews by a designated person was not included in the record. | NaN | 44.0 | NaN | NaN | NaN | NaN | |
| A responsible person did not document and date the determination that an HCT/P is available for distribution. | 1.0 | 1.0 | NaN | NaN | NaN | NaN | |
| A responsible person did not document the determination that an HCT/P is available for distribution. | 2.0 | 2.0 | NaN | NaN | NaN | NaN | |
| A 5 day report was not submitted to FDA on Form 3500A within 5 workdays of becoming aware that a reportable MDR event necessitates remedial action to prevent an unreasonable risk of substantial harm to the public health. | NaN | 10.0 | NaN | NaN | NaN | NaN |
country_df = pd.DataFrame(merged_df[['Country/Area', 'FiscalYear']])
for i in country_df.index:
if country_df['Country/Area'][i] != "United States":
country_df['Country/Area'][i]="Foreign"
## Pie-Chart showing Domestic vs Foreign Inspection Count
plt.figure(figsize=(8,8))
country_df['Country/Area'].value_counts().plot(kind='pie', autopct='%.1f%%')
plt.title("Domestic/Foreign")
plt.show()
## Data pre-processing before generating visuals.
state_codes = {
'District of Columbia' : 'dc','Mississippi': 'MS', 'Oklahoma': 'OK',
'Delaware': 'DE', 'Minnesota': 'MN', 'Illinois': 'IL', 'Arkansas': 'AR',
'New Mexico': 'NM', 'Indiana': 'IN', 'Maryland': 'MD', 'Louisiana': 'LA',
'Idaho': 'ID', 'Wyoming': 'WY', 'Tennessee': 'TN', 'Arizona': 'AZ',
'Iowa': 'IA', 'Michigan': 'MI', 'Kansas': 'KS', 'Utah': 'UT',
'Virginia': 'VA', 'Oregon': 'OR', 'Connecticut': 'CT', 'Montana': 'MT',
'California': 'CA', 'Massachusetts': 'MA', 'West Virginia': 'WV',
'South Carolina': 'SC', 'New Hampshire': 'NH', 'Wisconsin': 'WI',
'Vermont': 'VT', 'Georgia': 'GA', 'North Dakota': 'ND',
'Pennsylvania': 'PA', 'Florida': 'FL', 'Alaska': 'AK', 'Kentucky': 'KY',
'Hawaii': 'HI', 'Nebraska': 'NE', 'Missouri': 'MO', 'Ohio': 'OH',
'Alabama': 'AL', 'Rhode Island': 'RI', 'South Dakota': 'SD',
'Colorado': 'CO', 'New Jersey': 'NJ', 'Washington': 'WA',
'North Carolina': 'NC', 'New York': 'NY', 'Texas': 'TX',
'Nevada': 'NV', 'Maine': 'ME', 'Puerto Rico': 'PR', 'Virgin Islands': 'VI',
'Northern Mariana Islands':'NI','Guam':'GM','American Samoa':'AS'
}
us_df = merged_df[merged_df['Country/Area'] == 'United States']
us_df['State Code'] = us_df['State'].apply(lambda x : state_codes[x])
us_df.drop_duplicates(subset=['InspectionID'])
| InspectionID | FEINumber | LegalName | InspectionEndDate | ProgramArea | Act/CFRNumber | ShortDescription | LongDescription | City | State | Zip | Country/Area | FiscalYear | PostedCitations | Classification | ProjectArea | ProductType | FMD-145Date | State Code | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1167299 | 3017202975 | PAVmed Inc. | 2022-04-15 | Devices | 21 CFR 820.100(b) | Documentation | Corrective and preventive action activities an... | Foxboro | Massachusetts | 02035 | United States | 2022 | Yes | Voluntary Action Indicated (VAI) | Compliance: Devices | Devices | - | MA |
| 4 | 1167002 | 3006463947 | Ximedica LLC | 2022-04-13 | Devices | 21 CFR 820.30(g) | Design validation - production units | The design was not validated under defined ope... | Providence | Rhode Island | 02907 | United States | 2022 | Yes | Voluntary Action Indicated (VAI) | Compliance: Devices | Devices | - | RI |
| 6 | 1167016 | 3009492128 | Exogenesis Corporation | 2022-04-12 | Devices | 21 CFR 820.70(c) | Environmental control Lack of or inadequate p... | Procedures to control environmental conditions... | Billerica | Massachusetts | 01821 | United States | 2022 | Yes | Voluntary Action Indicated (VAI) | Compliance: Devices | Devices | - | MA |
| 10 | 1167392 | 3004217244 | Trio Community Meals | 2022-04-11 | Foods | 21 CFR 123.6(b) | No HACCP plan | You do not have a written HACCP plan that outl... | Woburn | Massachusetts | 01801 | United States | 2022 | Yes | Voluntary Action Indicated (VAI) | Foodborne Biological Hazards | Food/Cosmetics | - | MA |
| 13 | 1166559 | 3006468808 | Metro-Pack, Inc. | 2022-04-07 | Foods | 21 CFR 111.15(d)(2) | Pest control measures | You did not take effective measures to exclude... | Newburgh | New York | 12550 | United States | 2022 | Yes | Voluntary Action Indicated (VAI) | Food Composition, Standards, Labeling and Econ | Food/Cosmetics | - | NY |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 312996 | 535282 | 2938449 | IF Holding II, LLC dba Initiative Foods, LLC | 2008-10-01 | Foods | 21 CFR 110.35(a) | Buildings/good repair | Failure to maintain buildings and physical fac... | Sanger | California | 93657 | United States | 2009 | Yes | No Action Indicated (NAI) | Food and Color Additives Petition Review | Food/Cosmetics | - | CA |
| 313012 | 538923 | 1180856 | A & M Bakery, Inc. | 2008-10-01 | Foods | 21 CFR 110.10(b)(3) | Not washed/sanitized when appropriate | Employees did not wash hands thoroughly in an ... | Clarksburg | West Virginia | 26301 | United States | 2009 | Yes | No Action Indicated (NAI) | Food Composition, Standards, Labeling and Econ | Food/Cosmetics | - | WV |
| 313036 | 540095 | 1222039 | JT Sea Products, Inc. | 2008-10-01 | Foods | 21 CFR 110.20(b)(4) | Drip and condensate | The plant is not constructed in such a manner ... | New Bedford | Massachusetts | 02740 | United States | 2009 | Yes | Voluntary Action Indicated (VAI) | Foodborne Biological Hazards | Food/Cosmetics | - | MA |
| 313040 | 540313 | 3000203414 | Avanti Foods, Inc. | 2008-10-01 | Foods | 21 CFR 123.6(b) | HACCP plan location | Your HACCP plan is not specific to the kind of... | Milwaukee | Wisconsin | 53214 | United States | 2009 | Yes | Voluntary Action Indicated (VAI) | Foodborne Biological Hazards | Food/Cosmetics | - | WI |
| 313044 | 541050 | 3003895195 | Little Lad's Bakery, Inc. | 2008-10-01 | Foods | 21 CFR 110.35(a) | Buildings/good repair | Failure to maintain physical facilities in rep... | Corinth | Maine | 04427 | United States | 2009 | Yes | Voluntary Action Indicated (VAI) | Foodborne Biological Hazards | Food/Cosmetics | - | ME |
49687 rows × 19 columns
## Map showing State-wise Inspection Count
fig = px.choropleth(us_df, locations='State Code', color=us_df['InspectionID'], color_continuous_scale='spectral_r', hover_name='State', locationmode='USA-states', labels={'InspectionID':'Inspection count'}, scope='usa')
fig.show()
## Number of Inspections Yearly
pd.DataFrame(merged_df['FiscalYear'].value_counts())
| FiscalYear | |
|---|---|
| 2011 | 37506 |
| 2012 | 33656 |
| 2010 | 31330 |
| 2013 | 27399 |
| 2014 | 26231 |
| 2015 | 24722 |
| 2017 | 24271 |
| 2009 | 23356 |
| 2016 | 22834 |
| 2018 | 20688 |
| 2019 | 19461 |
| 2020 | 10526 |
| 2021 | 7288 |
| 2022 | 3779 |
## Country-wise and Year-wise count of Inspections
byFiscalYear = pd.DataFrame(country_df.groupby('Country/Area').FiscalYear.value_counts())
byFiscalYear.unstack().T
| Country/Area | Foreign | United States | |
|---|---|---|---|
| FiscalYear | |||
| FiscalYear | 2009 | 1033 | 22323 |
| 2010 | 1431 | 29899 | |
| 2011 | 3083 | 34423 | |
| 2012 | 4395 | 29261 | |
| 2013 | 4366 | 23033 | |
| 2014 | 4436 | 21795 | |
| 2015 | 5114 | 19608 | |
| 2016 | 4655 | 18179 | |
| 2017 | 3954 | 20317 | |
| 2018 | 3329 | 17359 | |
| 2019 | 3461 | 16000 | |
| 2020 | 1824 | 8702 | |
| 2021 | 12 | 7276 | |
| 2022 | 83 | 3696 |
# changing the rc parameters and plotting a line plot
plt.rcParams['figure.figsize'] = [15, 5]
# plt.figure(figsize=(50,50))
byFiscalYear.unstack().T.plot(kind='line')
plt.show()
## State-wise and Year-wise count of Inspections
byFiscalYear = pd.DataFrame(merged_df.groupby('FiscalYear').State.value_counts())
byFiscalYear.unstack().T
| FiscalYear | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| State | |||||||||||||||
| State | - | 1033.0 | 1431.0 | 3083.0 | 4395.0 | 4366.0 | 4436.0 | 5114.0 | 4655.0 | 3954.0 | 3329.0 | 3461.0 | 1824.0 | 12.0 | 83.0 |
| Alabama | 195.0 | 140.0 | 141.0 | 124.0 | 172.0 | 126.0 | 64.0 | 48.0 | 168.0 | 115.0 | 92.0 | 1.0 | 9.0 | 24.0 | |
| Alaska | 26.0 | 64.0 | 95.0 | 105.0 | 97.0 | 37.0 | 46.0 | 56.0 | 46.0 | 34.0 | 35.0 | 4.0 | NaN | NaN | |
| American Samoa | NaN | 32.0 | 5.0 | NaN | 2.0 | NaN | 4.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
| Arizona | 198.0 | 165.0 | 203.0 | 302.0 | 235.0 | 176.0 | 273.0 | 170.0 | 303.0 | 258.0 | 245.0 | 77.0 | 184.0 | 26.0 | |
| Arkansas | 51.0 | 109.0 | 84.0 | 199.0 | 140.0 | 103.0 | 88.0 | 63.0 | 132.0 | 51.0 | 81.0 | 13.0 | 13.0 | 2.0 | |
| California | 3048.0 | 4715.0 | 5182.0 | 3992.0 | 2894.0 | 2781.0 | 3322.0 | 2433.0 | 2330.0 | 2508.0 | 2311.0 | 1422.0 | 1616.0 | 765.0 | |
| Colorado | 605.0 | 911.0 | 947.0 | 734.0 | 568.0 | 504.0 | 483.0 | 378.0 | 507.0 | 381.0 | 334.0 | 260.0 | 104.0 | 55.0 | |
| Connecticut | 217.0 | 295.0 | 303.0 | 357.0 | 370.0 | 287.0 | 229.0 | 283.0 | 205.0 | 140.0 | 167.0 | 61.0 | 101.0 | 65.0 | |
| Delaware | 28.0 | 34.0 | 40.0 | 10.0 | 31.0 | 67.0 | 5.0 | 30.0 | 32.0 | 60.0 | 17.0 | 24.0 | 4.0 | 4.0 | |
| District of Columbia | 20.0 | 49.0 | 33.0 | 41.0 | 31.0 | 14.0 | 26.0 | 53.0 | 9.0 | 56.0 | 37.0 | 19.0 | 12.0 | 18.0 | |
| Florida | 2988.0 | 4308.0 | 4435.0 | 3719.0 | 3043.0 | 2698.0 | 1487.0 | 1727.0 | 2079.0 | 1775.0 | 1834.0 | 1029.0 | 564.0 | 208.0 | |
| Georgia | 423.0 | 787.0 | 771.0 | 527.0 | 459.0 | 435.0 | 316.0 | 430.0 | 504.0 | 351.0 | 286.0 | 195.0 | 105.0 | 29.0 | |
| Guam | NaN | NaN | 32.0 | NaN | NaN | 10.0 | 4.0 | NaN | 4.0 | 10.0 | NaN | NaN | NaN | NaN | |
| Hawaii | 39.0 | 135.0 | 147.0 | 219.0 | 96.0 | 114.0 | 125.0 | 94.0 | 83.0 | 74.0 | 87.0 | 33.0 | 22.0 | 8.0 | |
| Idaho | 97.0 | 172.0 | 226.0 | 144.0 | 184.0 | 117.0 | 96.0 | 60.0 | 129.0 | 51.0 | 77.0 | 45.0 | 50.0 | 6.0 | |
| Illinois | 1015.0 | 980.0 | 1510.0 | 1025.0 | 797.0 | 601.0 | 542.0 | 506.0 | 622.0 | 578.0 | 662.0 | 350.0 | 317.0 | 129.0 | |
| Indiana | 312.0 | 456.0 | 336.0 | 377.0 | 389.0 | 384.0 | 294.0 | 352.0 | 393.0 | 315.0 | 229.0 | 170.0 | 121.0 | 43.0 | |
| Iowa | 107.0 | 79.0 | 155.0 | 164.0 | 179.0 | 179.0 | 243.0 | 90.0 | 145.0 | 91.0 | 117.0 | 27.0 | 24.0 | 9.0 | |
| Kansas | 89.0 | 186.0 | 276.0 | 158.0 | 131.0 | 158.0 | 131.0 | 213.0 | 233.0 | 86.0 | 89.0 | 50.0 | 32.0 | 28.0 | |
| Kentucky | 238.0 | 400.0 | 548.0 | 401.0 | 287.0 | 336.0 | 277.0 | 326.0 | 431.0 | 262.0 | 139.0 | 42.0 | 93.0 | 41.0 | |
| Louisiana | 194.0 | 166.0 | 401.0 | 172.0 | 95.0 | 68.0 | 67.0 | 134.0 | 123.0 | 135.0 | 139.0 | 41.0 | 18.0 | 22.0 | |
| Maine | 179.0 | 143.0 | 187.0 | 482.0 | 199.0 | 292.0 | 199.0 | 129.0 | 138.0 | 131.0 | 141.0 | 32.0 | 32.0 | 19.0 | |
| Maryland | 333.0 | 465.0 | 357.0 | 424.0 | 371.0 | 379.0 | 206.0 | 352.0 | 369.0 | 208.0 | 186.0 | 221.0 | 65.0 | 18.0 | |
| Massachusetts | 702.0 | 890.0 | 1286.0 | 895.0 | 703.0 | 609.0 | 464.0 | 421.0 | 632.0 | 513.0 | 519.0 | 355.0 | 166.0 | 200.0 | |
| Michigan | 826.0 | 870.0 | 887.0 | 874.0 | 631.0 | 586.0 | 569.0 | 414.0 | 454.0 | 426.0 | 328.0 | 162.0 | 144.0 | 76.0 | |
| Minnesota | 433.0 | 576.0 | 631.0 | 541.0 | 433.0 | 358.0 | 328.0 | 282.0 | 453.0 | 360.0 | 320.0 | 148.0 | 167.0 | 89.0 | |
| Mississippi | 103.0 | 133.0 | 46.0 | 49.0 | 36.0 | 32.0 | 84.0 | 72.0 | 49.0 | 27.0 | 113.0 | 24.0 | 16.0 | 8.0 | |
| Missouri | 334.0 | 413.0 | 561.0 | 414.0 | 281.0 | 486.0 | 433.0 | 256.0 | 284.0 | 305.0 | 178.0 | 83.0 | 47.0 | 25.0 | |
| Montana | 135.0 | 71.0 | 64.0 | 82.0 | 70.0 | 92.0 | 27.0 | 31.0 | 19.0 | 31.0 | 20.0 | 14.0 | 1.0 | 2.0 | |
| Nebraska | 52.0 | 28.0 | 83.0 | 41.0 | 53.0 | 70.0 | 88.0 | 51.0 | 119.0 | 72.0 | 50.0 | 36.0 | 20.0 | 8.0 | |
| Nevada | 85.0 | 66.0 | 137.0 | 66.0 | 112.0 | 124.0 | 119.0 | 106.0 | 90.0 | 127.0 | 172.0 | 135.0 | 63.0 | 3.0 | |
| New Hampshire | 67.0 | 128.0 | 98.0 | 153.0 | 86.0 | 134.0 | 69.0 | 140.0 | 79.0 | 46.0 | 74.0 | 25.0 | 29.0 | 9.0 | |
| New Jersey | 860.0 | 931.0 | 1254.0 | 950.0 | 812.0 | 651.0 | 583.0 | 463.0 | 486.0 | 565.0 | 463.0 | 255.0 | 244.0 | 137.0 | |
| New Mexico | 165.0 | 140.0 | 249.0 | 79.0 | 101.0 | 175.0 | 66.0 | 107.0 | 143.0 | 96.0 | 30.0 | 38.0 | 49.0 | 11.0 | |
| New York | 1506.0 | 1895.0 | 2172.0 | 2262.0 | 1639.0 | 1871.0 | 1816.0 | 1462.0 | 1691.0 | 1510.0 | 1286.0 | 809.0 | 835.0 | 367.0 | |
| North Carolina | 997.0 | 899.0 | 503.0 | 502.0 | 342.0 | 311.0 | 310.0 | 352.0 | 393.0 | 393.0 | 453.0 | 251.0 | 123.0 | 80.0 | |
| North Dakota | 42.0 | 50.0 | 42.0 | 71.0 | 12.0 | 19.0 | 2.0 | 9.0 | 15.0 | 21.0 | 4.0 | 5.0 | 4.0 | NaN | |
| Ohio | 886.0 | 887.0 | 1270.0 | 826.0 | 620.0 | 881.0 | 903.0 | 921.0 | 1097.0 | 951.0 | 620.0 | 235.0 | 195.0 | 146.0 | |
| Oklahoma | 72.0 | 125.0 | 206.0 | 109.0 | 142.0 | 257.0 | 155.0 | 160.0 | 129.0 | 70.0 | 81.0 | 15.0 | 35.0 | 11.0 | |
| Oregon | 146.0 | 358.0 | 369.0 | 357.0 | 361.0 | 298.0 | 261.0 | 416.0 | 515.0 | 208.0 | 227.0 | 54.0 | 76.0 | 27.0 | |
| Pennsylvania | 596.0 | 969.0 | 924.0 | 846.0 | 683.0 | 740.0 | 876.0 | 690.0 | 745.0 | 552.0 | 597.0 | 258.0 | 160.0 | 82.0 | |
| Puerto Rico | 455.0 | 378.0 | 836.0 | 705.0 | 597.0 | 478.0 | 442.0 | 439.0 | 369.0 | 311.0 | 288.0 | 122.0 | 185.0 | 165.0 | |
| Rhode Island | 184.0 | 165.0 | 182.0 | 144.0 | 69.0 | 105.0 | 103.0 | 90.0 | 147.0 | 88.0 | 38.0 | 87.0 | 24.0 | 20.0 | |
| South Carolina | 260.0 | 223.0 | 119.0 | 155.0 | 82.0 | 207.0 | 126.0 | 181.0 | 183.0 | 131.0 | 141.0 | 72.0 | 22.0 | 41.0 | |
| South Dakota | NaN | 21.0 | 41.0 | 10.0 | 52.0 | 5.0 | 7.0 | 6.0 | NaN | 27.0 | 38.0 | 6.0 | 8.0 | 6.0 | |
| Tennessee | 335.0 | 323.0 | 384.0 | 402.0 | 460.0 | 380.0 | 471.0 | 435.0 | 381.0 | 335.0 | 240.0 | 105.0 | 72.0 | 30.0 | |
| Texas | 919.0 | 2157.0 | 2313.0 | 1781.0 | 1259.0 | 1021.0 | 746.0 | 855.0 | 880.0 | 869.0 | 806.0 | 686.0 | 597.0 | 253.0 | |
| Utah | 248.0 | 443.0 | 545.0 | 355.0 | 448.0 | 441.0 | 334.0 | 504.0 | 470.0 | 444.0 | 400.0 | 65.0 | 141.0 | 56.0 | |
| Vermont | 46.0 | 68.0 | 84.0 | 200.0 | 88.0 | 77.0 | 120.0 | 133.0 | 70.0 | 66.0 | 58.0 | 24.0 | 18.0 | 61.0 | |
| Virgin Islands | NaN | 6.0 | 29.0 | 55.0 | 14.0 | 32.0 | 6.0 | 19.0 | 31.0 | 4.0 | 15.0 | 4.0 | 4.0 | NaN | |
| Virginia | 261.0 | 351.0 | 448.0 | 329.0 | 371.0 | 298.0 | 551.0 | 316.0 | 250.0 | 240.0 | 334.0 | 95.0 | 69.0 | 67.0 | |
| Washington | 717.0 | 754.0 | 1137.0 | 1197.0 | 700.0 | 624.0 | 588.0 | 461.0 | 649.0 | 409.0 | 386.0 | 287.0 | 156.0 | 55.0 | |
| West Virginia | 165.0 | 96.0 | 74.0 | 48.0 | 81.0 | 42.0 | 39.0 | 18.0 | 36.0 | 48.0 | 13.0 | 6.0 | 12.0 | 7.0 | |
| Wisconsin | 313.0 | 674.0 | 999.0 | 1080.0 | 860.0 | 520.0 | 390.0 | 434.0 | 450.0 | 426.0 | 359.0 | 125.0 | 104.0 | 123.0 | |
| Wyoming | 11.0 | 20.0 | 36.0 | 7.0 | 65.0 | 5.0 | 5.0 | 8.0 | 23.0 | 18.0 | 44.0 | NaN | 4.0 | 12.0 |
# changing the rc parameters and plotting a line plot
plt.rcParams['figure.figsize'] = [30, 30]
# plt.figure(figsize=(50,50))
byFiscalYear.unstack().plot(kind='line')
plt.show()
print("Double Click on Graph to view it properly.")
Double Click on Graph to view it properly.
- Yearly -> Yes
- Domestic/Foreign (Yearly) -> Yes
- State-wise (Yearly) -> Yes
## Classification-wise product type
byClassification = pd.DataFrame(merged_df.groupby('Classification').ProductType.value_counts())
byClassification.unstack().T
| Classification | No Action Indicated (NAI) | Official Action Indicated (OAI) | Voluntary Action Indicated (VAI) | |
|---|---|---|---|---|
| ProductType | ||||
| ProductType | Biologics | 963.0 | 1248.0 | 9120.0 |
| Devices | 13202.0 | 15365.0 | 46527.0 | |
| Drugs | 951.0 | 9625.0 | 28562.0 | |
| Food/Cosmetics | 25980.0 | 24726.0 | 121348.0 | |
| Tobacco | 3.0 | NaN | NaN | |
| Veterinary | 1963.0 | 3413.0 | 10051.0 |
# changing the rc parameters and plotting a line plot
plt.rcParams['figure.figsize'] = [8, 8]
#plt.figure(figsize=(25,12))
byClassification.unstack().T.plot(kind='bar', stacked=True)
<AxesSubplot:xlabel='None,ProductType'>
## Classification-wise Project Area
byClassification = pd.DataFrame(merged_df.groupby('Classification').ProjectArea.value_counts())
byClassification.unstack().T
| Classification | No Action Indicated (NAI) | Official Action Indicated (OAI) | Voluntary Action Indicated (VAI) | |
|---|---|---|---|---|
| ProjectArea | ||||
| ProjectArea | Bioresearch Monitoring | 424.0 | 1745.0 | 7752.0 |
| Blood and Blood Products | 358.0 | 377.0 | 4918.0 | |
| Colors and Cosmetics Technology | 1278.0 | 266.0 | 1476.0 | |
| Compliance: Devices | 995.0 | 9438.0 | 30939.0 | |
| Drug Quality Assurance | 492.0 | 7604.0 | 22406.0 | |
| Food Composition, Standards, Labeling and Econ | 7783.0 | 8106.0 | 19329.0 | |
| Food and Color Additives Petition Review | 6466.0 | 273.0 | 1638.0 | |
| Foodborne Biological Hazards | 4092.0 | 15539.0 | 91886.0 | |
| Human Cellular, Tissue, and Gene Therapies | 395.0 | 720.0 | 3494.0 | |
| Molecular Biology and Natural Toxins | 17.0 | NaN | 11.0 | |
| Monitoring of Marketed Animal Drugs, Feed, and Devices | 1948.0 | 3333.0 | 9736.0 | |
| Over-the-Counter Drug Evaluation | 14.0 | 26.0 | 47.0 | |
| Pesticides and Chemical Contaminants | 6242.0 | 337.0 | 2841.0 | |
| Postmarket Assurance: Devices | 11498.0 | 4951.0 | 12124.0 | |
| Postmarket Surv. and Epidemiology | 38.0 | 5.0 | 118.0 | |
| Pre-Approval Eval. of Animal Drugs and Food Additives | 12.0 | 74.0 | 292.0 | |
| Prescription Drug Advertising and Labelling | 6.0 | NaN | 2.0 | |
| Radiation Control and Health Safety Act | 555.0 | 273.0 | 930.0 | |
| Technical Assistance: Food and Cosmetics | 102.0 | 205.0 | 4161.0 | |
| Tobacco Post-Market Activities | 3.0 | NaN | NaN | |
| Unapproved and Misbranded Drugs | 146.0 | 1101.0 | 1411.0 | |
| Vaccines and Allergenic Products | 198.0 | 4.0 | 97.0 |
# changing the rc parameters and plotting a line plot
plt.rcParams['figure.figsize'] = [300, 5]
#plt.figure(figsize=(25,12))
byClassification.unstack().plot(kind='pie', subplots=True)
print("Double click on the graphs to see them properly.")
Double click on the graphs to see them properly.
Product Type - Observations can see that some of the product type such as Food/Cosmetic has higher number of VAI (Voluntary Action Indicated) Classification.
Project Area - It can be seen from the observations that majority of Project Areas has VAI (Voluntary Action Indicated) while there are some outliers such as Tobacco Post-Market Activities, Pesticides and Chemical Contaminants, Prescription Drug Advertising and Labelling and Vaccines and Allergenic Products where NAI (No Action Indicated) is on higher end.
## State-wise and Year-wise count of Inspections
byClassification = pd.DataFrame(merged_df.groupby('Classification').State.value_counts())
byClassification.unstack().T
| Classification | No Action Indicated (NAI) | Official Action Indicated (OAI) | Voluntary Action Indicated (VAI) | |
|---|---|---|---|---|
| State | ||||
| State | - | 5656.0 | 6870.0 | 28650.0 |
| Alabama | 127.0 | 403.0 | 889.0 | |
| Alaska | 104.0 | 11.0 | 530.0 | |
| American Samoa | 24.0 | 13.0 | 6.0 | |
| Arizona | 355.0 | 419.0 | 2041.0 | |
| Arkansas | 82.0 | 347.0 | 700.0 | |
| California | 5332.0 | 7086.0 | 26901.0 | |
| Colorado | 765.0 | 1343.0 | 4663.0 | |
| Connecticut | 302.0 | 369.0 | 2409.0 | |
| Delaware | 7.0 | 36.0 | 343.0 | |
| District of Columbia | 22.0 | 47.0 | 349.0 | |
| Florida | 8600.0 | 5402.0 | 17892.0 | |
| Georgia | 822.0 | 916.0 | 3880.0 | |
| Guam | NaN | NaN | 60.0 | |
| Hawaii | 75.0 | 395.0 | 806.0 | |
| Idaho | 87.0 | 391.0 | 976.0 | |
| Illinois | 1119.0 | 1560.0 | 6955.0 | |
| Indiana | 508.0 | 1029.0 | 2634.0 | |
| Iowa | 204.0 | 303.0 | 1102.0 | |
| Kansas | 232.0 | 288.0 | 1340.0 | |
| Kentucky | 800.0 | 439.0 | 2582.0 | |
| Louisiana | 101.0 | 385.0 | 1289.0 | |
| Maine | 122.0 | 426.0 | 1755.0 | |
| Maryland | 416.0 | 506.0 | 3032.0 | |
| Massachusetts | 703.0 | 784.0 | 6868.0 | |
| Michigan | 904.0 | 1736.0 | 4607.0 | |
| Minnesota | 524.0 | 663.0 | 3932.0 | |
| Mississippi | 35.0 | 340.0 | 417.0 | |
| Missouri | 439.0 | 667.0 | 2994.0 | |
| Montana | 31.0 | 128.0 | 500.0 | |
| Nebraska | 91.0 | 173.0 | 507.0 | |
| Nevada | 223.0 | 327.0 | 855.0 | |
| New Hampshire | 97.0 | 130.0 | 910.0 | |
| New Jersey | 725.0 | 1941.0 | 5988.0 | |
| New Mexico | 101.0 | 214.0 | 1134.0 | |
| New York | 1654.0 | 3234.0 | 16233.0 | |
| North Carolina | 1264.0 | 732.0 | 3913.0 | |
| North Dakota | 52.0 | 53.0 | 191.0 | |
| Ohio | 2784.0 | 1416.0 | 6238.0 | |
| Oklahoma | 234.0 | 339.0 | 994.0 | |
| Oregon | 339.0 | 520.0 | 2814.0 | |
| Pennsylvania | 1012.0 | 1270.0 | 6436.0 | |
| Puerto Rico | 681.0 | 1055.0 | 4034.0 | |
| Rhode Island | 100.0 | 204.0 | 1142.0 | |
| South Carolina | 304.0 | 337.0 | 1302.0 | |
| South Dakota | 13.0 | 69.0 | 145.0 | |
| Tennessee | 584.0 | 1467.0 | 2302.0 | |
| Texas | 1817.0 | 3203.0 | 10122.0 | |
| Utah | 628.0 | 1116.0 | 3150.0 | |
| Vermont | 65.0 | 231.0 | 817.0 | |
| Virgin Islands | 16.0 | 23.0 | 180.0 | |
| Virginia | 408.0 | 426.0 | 3146.0 | |
| Washington | 757.0 | 1163.0 | 6200.0 | |
| West Virginia | 48.0 | 138.0 | 499.0 | |
| Wisconsin | 565.0 | 1218.0 | 5074.0 | |
| Wyoming | 2.0 | 76.0 | 180.0 |
# changing the rc parameters and plotting a line plot
plt.rcParams['figure.figsize'] = [600, 7]
#plt.figure(figsize=(25,12))
byClassification.unstack().plot(kind='pie', subplots=True)
print("Double click on the graphs to see them properly.")
Double click on the graphs to see them properly.
This can be results of number of inspections in that state. If the number of inspections are high then there are diverse classifications.